AITopics | delta model

Collaborating Authors

delta model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

9856b5d30ac61ab744fdab6f67d874e4-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 20:31:25 GMT

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Dominican Republic (0.04)
(5 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Cross-model Control: Improving Multiple Large Language Models in One-time Training

Neural Information Processing SystemsOct-10-2025, 10:39:05 GMT

However, how to reuse the fine-tuning outcomes of one model to other models to reduce training costs remains a challenge.

delta model, language model, llm, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Dominican Republic (0.04)
(5 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Cross-model Control: Improving Multiple Large Language Models in One-time Training

Wu, Jiayi, Sun, Hao, Cai, Hengyi, Su, Lixin, Wang, Shuaiqiang, Yin, Dawei, Li, Xiang, Gao, Ming

arXiv.org Artificial IntelligenceOct-23-2024

The number of large language models (LLMs) with varying parameter scales and vocabularies is increasing. While they deliver powerful performance, they also face a set of common optimization needs to meet specific requirements or standards, such as instruction following or avoiding the output of sensitive information from the real world. However, how to reuse the fine-tuning outcomes of one model to other models to reduce training costs remains a challenge. To bridge this gap, we introduce Cross-model Control (CMC), a method that improves multiple LLMs in one-time training with a portable tiny language model. Specifically, we have observed that the logit shift before and after fine-tuning is remarkably similar across different models. Based on this insight, we incorporate a tiny language model with a minimal number of parameters. By training alongside a frozen template LLM, the tiny model gains the capability to alter the logits output by the LLMs. To make this tiny language model applicable to models with different vocabularies, we propose a novel token mapping strategy named PM-MinED. We have conducted extensive experiments on instruction tuning and unlearning tasks, demonstrating the effectiveness of CMC. Our code is available at https://github.com/wujwyi/CMC.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.17599

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Dominican Republic (0.04)
Asia > Singapore (0.04)
(5 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (0.66)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Characterizing Disparity Between Edge Models and High-Accuracy Base Models for Vision Tasks

Wang, Zhenyu, Nirjon, Shahriar

arXiv.org Artificial IntelligenceJul-13-2024

Edge devices, with their widely varying capabilities, support a diverse range of edge AI models. This raises the question: how does an edge model differ from a high-accuracy (base) model for the same task? We introduce XDELTA, a novel explainable AI tool that explains differences between a high-accuracy base model and a computationally efficient but lower-accuracy edge model. To achieve this, we propose a learning-based approach to characterize the model difference, named the DELTA network, which complements the feature representation capability of the edge network in a compact form. To construct DELTA, we propose a sparsity optimization framework that extracts the essence of the base model to ensure compactness and sufficient feature representation capability of DELTA, and implement a negative correlation learning approach to ensure it complements the edge model. We conduct a comprehensive evaluation to test XDELTA's ability to explain model discrepancies, using over 1.2 million images and 24 models, and assessing real-world deployments with six participants. XDELTA excels in explaining differences between base and edge models (arbitrary pairs as well as compressed base models) through geometric and concept-level analysis, proving effective in real-world applications.

accuracy, base model, edge model, (16 more...)

arXiv.org Artificial Intelligence

2407.10016

Country: North America > United States > North Carolina (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Reasoning over Description Logic-based Contexts with Transformers

Poulis, Angelos, Tsalapati, Eleni, Koubarakis, Manolis

arXiv.org Artificial IntelligenceNov-15-2023

One way that the current state of the art measures the reasoning ability of transformer-based models is by evaluating accuracy in downstream tasks like logical question answering or proof generation over synthetic contexts expressed in natural language. However, most of the contexts used are in practice very simple; in most cases, they are generated from short first-order logic sentences with only a few logical operators and quantifiers. In this work, we seek to answer the question how well a transformer-based model will perform reasoning over expressive contexts. For this purpose, we construct a synthetic natural language question-answering dataset, generated by description logic knowledge bases. For the generation of the knowledge bases, we use the expressive language $\mathcal{ALCQ}$. The resulting dataset contains 384K examples, and increases in two dimensions: i) reasoning depth, and ii) length of sentences. We show that the performance of our DeBERTa-based model, DELTA$_M$, is marginally affected when the reasoning depth is increased and it is not affected at all when the length of the sentences is increasing. We also evaluate the generalization ability of the model on reasoning depths unseen at training, both increasing and decreasing, revealing interesting insights into the model's adaptive generalization abilities.

dataset, natural language, quantifier, (15 more...)

arXiv.org Artificial Intelligence

2311.08941

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Georgia > Clarke County > Athens (0.04)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Description Logic (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback